•  What’s New in the HPCMP

—  New  hardware 

—  HPC  Software  Application  Institutes 
—  Capability  Allocations 
—  Open  Research  Systems 
—  On-demand  Computing 

•  Performance  Measures  -  HPCMP 

•  Performance  Measures  -  Challenges  &  Opportunities 
Year


Fiscal  Year  (TI-XX) 


HPC  Center 


System 


Processors 


Major 

Shared 


Army  Research  IBM  P3 

Laboratory  (ARL)  SGI  Origin  3800 

IBM  P4 

Linux  Networx  Cluster 
LNX1  Xeon  Cluster 
IBM  Opteron  Cluster 
SGI  Altix  Cluster 


1,280  PEs 
256  PEs 
512  PEs 
768  PEs 
128  PEs 
256  PEs 
2,100  PEs 
2,372  PEs 
256  PEs 


Resource 

Centers 


Aeronautical 
Systems  Center 
(ASC) 


Compaq  SC-45 
IBM  P3 

COMPAQ  SC-40 
SGI  Origin  3900 
SGI  Origin  3900 
IBM  P4 


836  PEs 
528  PEs 
64  PEs 
2,048  PEs 
128  PEs 
32  PEs 


FY  01  and  earlier 
FY  02 
FY  03 
FY  04 


Engineer  Research 
and  Development 
Center  (ERDC) 


Compaq  SC-40 
Compaq  SC-45 
SGI  Origin  3800 
Cray  T3E 
SGI  Origin  3900 
Cray X1


Naval  IBM  P4 

Oceanographic  SV1 

Office  (NAVO)  IBM  P4 


512  PEs 
512  PEs 
512  PEs 
1,888  PEs 
1,024  PEs 
64  PEs 


1,408  PEs 
64  PEs 
3,456  PEs 


"MHPCRC 



HPC  Center 

System 

Processors 

Army  High 

Cray  T3E 

1,088  PEs 

FY  01  and  earlier 

Performance 

Cray X1, LC

128  PEs 

Computing  Center 
(AHPCRC) 

64  PEs 

FY  02 

FY  03 

Arctic  Region 

Cray  T3E 

272  PEs 

FY  04  upgrades 

Supercomputing 

Cray  SV1 

32  PEs 

Center  (ARSC) 

IBM  P3 

IBM  Regatta  P4 

200  PEs 
800  PEs 

Cray X1

128  PEs 

Why  is  the  date 

Maui  High  Performance 

IBM  P3  (2) 

736/320  PEs 

important? 

Computing  Center 

IBM  Netfinity 

512  PEs 

Generally  we  see 

(MHPCC) 

Cluster 

IBM  P4 

320  PEs 

price-performance 
gains of ~1.68x per year

Space  &  Missile 

SGI  Origins 

1,200  PEs 

(e.g.,  2001  =  1 

Defense  Command 

Cray  SV-1 

32  PEs 

(SMDC) 

W.S.  Cluster 

64  PEs 

2002  =  1.68  x 

IBM  e1300  Cluster 

256  PEs 

2003  =  2.82  x 

Linux  Cluster 

IBM  Regatta  P4 

256  PEs 

32  PEs 

2004  =  4.74  x 

Description 

Location 

System 

(Processors/Memory) 

HP  Superdome 

32  PEs 

Arnold  Engineering 

IBM  Itanium  Cluster 

16  PEs 

Development  Center  (AEDC) 

IBM  Regatta  P4 

64  PEs 

Pentium  Cluster 

8  PEs 

Air Force Research

Sky  HPC-1 

384  PEs 

Laboratory,  Information 
Directorate  (AFRL/IF) 

Air  Force  Weather  Agency 

IBM  Regatta  P4 

96  PEs 

(AFWA) 

Heterogeneous  HPC 

96  PEs 

Aberdeen  Test  Center  (ATC) 

Powerwulf 

32  PEs 

Powerwulf 

32  PEs 

Fleet Numerical Meteorology

SGI  Origin3900 

256  PEs 

and  Oceanography  Center 
(FNMOC) 

IBM  Regatta  P4 

96  PEs 

Joint  Forces  Command 
(JFCOM) 

Xeon  Cluster 

256  PEs 

FY  04  new  systems  and/or  upgrades 


As of: April 2004
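
The price-performance rule of thumb quoted in the sidebar above (gains of roughly 1.68x per year, with 2001 = 1) compounds from year to year. The following is a minimal sketch of that arithmetic, assuming a constant annual factor; the function name and defaults are illustrative and do not come from the slides.

```python
def relative_price_performance(year, base_year=2001, annual_gain=1.68):
    """Price-performance relative to base_year, assuming a constant annual gain.

    The 1.68 factor and the 2001 baseline are taken from the slide; treating
    the gain as constant over time is a simplifying assumption.
    """
    return annual_gain ** (year - base_year)


if __name__ == "__main__":
    for year in range(2001, 2005):
        print(year, round(relative_price_performance(year), 2))
```

This reproduces the sidebar values for 2002 through 2004, which is why the acquisition year of a system matters when comparing processor counts across the tables.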


HPCMP Dedicated Distributed Centers


Location 

System 

Description 

(Processors/Memory) 

Naval  Air  Warfare  Center,  Aircraft 

SGI  Origin  2000 

30  PEs 

Division  (NAWCAD) 

SGI  Origin  3900 

64  PEs 

Naval  Research  Laboratory-DC 

SUN  Sunfire  6800 

32  PEs 

(NRL-DC) 

Cray  MTA 

40  PEs 

SGI  Altix 

128  PEs 

SGI  Origin  3000 

128  PEs 

Redstone  Technical  Test  Center 
(RTTC) 

SGI  Origin  3900 

28  PEs 

Simulations  &  Analysis  Facility 

SGI  Origin  3900 

24  PEs 

(SIMAF) 

Beowulf  Cluster 

Space  and  Naval  Warfare 

Linux  Cluster 

128  PEs 

Systems  Center-San  Diego 
(SSCSD) 

IBM  Regatta  P4 

128  PEs 

White Sands Missile Range
(WSMR) 

Linux  Networx 

64  PEs 

FY 04 new systems and/or upgrades


As of: April 2004
Name | Org | Web URL | Contact Information
Brad Comes | HPCMO | http://www.hpcmo.hpc.mil | 703-812-8205, bcomes@hpcmo.hpc.mil
Tom Kendall | ARL MSRC | http://www.arl.hpc.mil | 410-278-9195, tkendall@arl.army.mil
Jeff Graham | ASC MSRC | http://www.asc.hpc.mil/ | 937-904-5135, Jeff.Graham@wpafb.af.mil
Chris Flynn | AFRL Rome DC | http://www.if.afrl.af.mil/tech/facilities/HPC/hpcf.html | 315-330-3249, Christopher.Flynn@rl.af.mil
Dr. Lynn Parnell | SSCSD DC | http://www.spawar.navy.mil/sandiego/ | 619-553-1592, parnell@sscsd.hpc.mil
Maj Kevin Benedict | MHPCC DC | http://www.mhpcc.edu | 808-874-1604, Kevin.Benedict@maui.afmc.af.mil

Retain  third-copy  of  critical  data  at  a  hardened  backup  site  so  users 
can  access  their  files  from  an  alternate  site  in  the  event  of  disruption 

of  their  primary  support  site 


•  Status: 

-  All MSRCs, MHPCC, and ARSC will have “off-site”
third-copy backup storage for critical data

-  On-going initiative

•  Working  with  centers  to  document  the  kinds  of  data  that 
would  need  to  be  recovered 

•  Implementation  to  begin  Q1  FY05 


Facilitates  information  integration 


[Figure: map of networked sites; legible labels include ARSC, NAWCAD, USJFCOM, NAVO, and MHPCC]


Enables  Users  to  Easily  Move  Between  Centers  Without  the  Requirement 

to  Learn  and  Adapt  to  Unique  Configurations 


Software  Applications  Support 


HPC  Software 
Applications 
Institutes 



Lasting  impact  on  services 
High  value  service  programs 


HPC  Software 
Portfolios 


PET  Partners 
Transfer of new technologies
from  universities 

On-site  support 

Training 


Software 

Protection 


•  Tightly  Integrated  software 

•  Address top DoD S&T and
T&E problems
•  Assure software intended use/user


•  Protect  software  through  source 
insertion 


•  5-8 HPC Software (Applications) Institutes
   -  HPCMP chartered
   -  Service managed
   -  3-6 year duration
      •  Ends with transition to local support
   -  $0.5-3M annual funding for:
      •  3-12 computational and computer scientists
      •  Support development of new and existing codes
      •  Adjust local business practice to use science-based models & simulation
   -  Integrated with PET


Biotechnology  HSAI  for  Force  Health  Protection,  MRMC 

Computational  Prediction  of 
Protein  Structure/Function 


Institute  for  Maneuverability  and 
Terrain  Physics  Simulation,  ERDC 


Force  Health  Protection 

Experimentation 


Acoustics/Seismics 


Countermine/IED/UXO 


Threat  Detection/Diagnosis 


Terrain  Mobility 


""  Outcrop 
Tracked  Vehicle 


Simulation 


Revolutionize Arming the Warfighter


Battlespace  Environments  Institute, 


Earth  and  Space 
Modeling 


Atmospheric  Model 


Image  Enhancement 


Non-Imaging  Space  Object 
Identification  and  Data  Fusion 


Counterspace/Space-Based 

Surveillance 


Astrodynamics


HPC  Software  Api 


•  Patterned  after  successful  DOE  fellowship  program 

•  National Defense Science and Engineering Graduate
Fellowship Program (NDSEG) chosen as vehicle for
execution of fellowships

—  HPCMP  added  as  fellowship  sponsor  along  with  Army,  Navy,  and  Air  Force 
—  Computer  and  computational  sciences  added  as  possible  discipline 

•  HPCMP is sponsoring 11 fellows for 2004 and similar
numbers each following year

•  HPCMP  fellows  are  strongly  encouraged  to  develop  close  ties 
with  DoD  laboratories  or  test  centers,  including  summer 
research  projects 


•  User  organizations  have  responded  to  DUSD  (S&T)  memo 
with  fellowship  POCs  to  select  and  interact  with  fellows. 



2004 HPEC Conference

HPCMP  Resource  Allocation  Policy 
Capability  Allocations 

Goal:  Support  the  top  capability  work 

How: 

•  New TI-XX resources generally are implemented for a few months before the end of
the current fiscal year without formal allocation

•  Dedicate major fractions of large new systems to short-term, massive computations
that generally cannot be addressed under normal shared resource operations for
the first 2-3 months of life

•  HPCMP issued call for short-term Capability Application Project (CAP) proposals

•  Capability Application Projects will be implemented between October and
December on large new systems each year

   -  Proposals are required to show that the application can efficiently use on the
   order of 1,000 processors or more and would solve a very difficult, important
   short-term computational problem


•  Call released to HPCMP community on 22 April 2004 with responses sent
to HPCMPO by 1 June 2004

   -  21 proposals received across all large CTAs (CSM, CFD, CCM, CEA,
   and CWO)

•  CAPs will be run on new 3,000 processor Power4+ at NAVO, 2,100
processor Xeon and 2,300 processor Opteron clusters at ARL

•  CAPs will be run in two phases:

   -  Exploratory phase designed to test scalability and efficiency of
   application codes to significant fractions of systems (5-15 projects on
   each system)

   -  Production phase designed to accomplish significant capability work
   with efficient, scalable codes (1-3 projects on each system)

•  Production phase of CAPs will be run after normal acceptance testing and
pioneer work on these systems


“Open Research” Systems

•  In response to customer demand: ~50% of Challenge Project leaders prefer to use
an “open research” system

•  “Open Research” systems concentrate on basic research, allowing better separation of
sensitive and non-sensitive information

   -  Minimal background check, facilitating graduate student and foreign national
   access

•  For FY05 the systems at ARSC will transition into an “open research” mode of
operation

   -  Eliminate the requirement for users of that system to have NACs

   -  Customers would have to “certify” that their work is unclassified and non-sensitive
   (e.g., open literature, basic research)

   -  All other operational and security policies apply, such as all users of HPCMP
   resources must be valid DoD users assigned to a DoD computational project

   -  Consistent with Uniform Use-Access Policy

•  The account application process for “open research” centers or systems requires
certification by government program manager that computational work is cleared for
open literature publication

   -  Component of FY 2005 account request

•  Operations on all other systems remain under current policies


•  "Real-time" community has asked for "guaranteed" or on-demand
service from shared resource centers

   -  Request is aimed at ensuring quick response time from shared
   resource when system is being used interactively

   -  Results needed now; can’t wait

•  Current policy requires that all Service/Agency work be covered by
an allocation

   -  Note: "On-demand" system will have lower utilization but fast
   turnaround

   -  Service "valuation" of this service demonstrated by FY05
   allocations: need sufficient allocation to dedicate a system to this
   mode of support

•  Anticipating the Services/Agencies will allocate sufficient time to
dedicate one 256 processor cluster at ARL


•  Goal:  Assess  the  potential  value  and  cost  of  providing 
greater  interactive  access  to  HPC  resources  to  the  DoD 
RDT&E  community  and  its  contractors. 

•  Means:  Provide  both  unclassified  and  classified  distributed 
HPC  resources  to  the  DoD  HPC  community  in  FY05  for 
interactive  experimentation  exploring  new  applications 
and  system  configurations 


[Figure: network diagram of remote users and networked HPCs. Legend: unclassified systems in black, classified systems in red. Labeled system: MHPCC Koa cluster.]

•  Distributed HPCs accessed by authorized users anywhere on the DREN and Internet
•  Interactive and time-critical problems


•  Low  latency  support  for  interactive  and  real-time 
applications — proper  HPC  configuration? 


•  Cohabitation  of  interactive  and  batch  jobs? 

•  Web-based  access  to  network  of  HPC’s  with  enhanced 
usability 

•  Consistency  with  HPCMP  approved  secure  environment 
using  DREN  and  SDREN 

•  Information  management  system  supporting  distributed 
HPC  applications 

•  Demonstrating  new  C4ISR  applications  of  HPC 


•  Expanding  FMS  use  beyond  Joint  experimentation  to 


include  training  and  mission  rehearsal 




•  Objectives: to provide SIP users with a High Productivity
Interactive Parallel MATLAB environment (it will provide the
user-friendly MATLAB high-level language syntax plus the
computational power of the interactive HPCs)

•  To allow interactive experiments for demanding SIP problems:
problems that take too long to finish on a single workstation, or
that require more memory than is available on a single
computer, or that have both constraints, where users’
research may benefit from an interactive modus operandi

•  Approach: to use MatlabMPI or other viable Parallel MATLAB
approaches to deliver parallel execution while keeping the familiar
MATLAB interactive environment

•  It may serve as a vehicle to collect experimental data about
productivity issues: are SIP users really more productive on such
an interactive HPC MATLAB platform (versus the traditional
batch-oriented HPCs)?


Site | Computer | Memory and I/O | Online
ARL MSRC, Aberdeen, MD | Unclass - Powell: 128 node dual 3.06 GHz Xeon cluster | 2 GB DRAM and 64 GB disk/node, Myrinet & GigEnet/100MB backplane | Est. 10/04 w/batch; 4/05 share with batch
ASC MSRC, Dayton, OH | Unclass - Mach2: 24 node dual 2.66 GHz Xeon, Linux | 4 GB DRAM and 80 GB disk/node, dual GigEnet | Est. 10/04
ASC MSRC, Dayton, OH | Class - Glenn: 128 node dual Xeon, Linux | 4 GB DRAM and local disks | Est. Spring/05
AFRL, Rome, NY | Unclass - Coyote: 26 node dual 3.06 GHz Xeon, Linux | 6 GB DRAM and 400 GB disk/node, dual GigEnet | Yes
AFRL, Rome, NY | Class - Wile: 14 node dual 2.66/3.06 GHz Xeon, Linux | 6 GB DRAM and 200 GB disk/node, dual GigEnet | Est. 12/04
SSCSD, San Diego, CA | Unclass - Seahawk: 16 node 1.3 GHz Itanium2, Linux | 2 GB DRAM and 36 GB disk/node, dual GigEnet | Est. 12/04
SSCSD, San Diego, CA | Class - Seafarer: 24 node dual 3.06 GHz | 4 GB DRAM and 80 GB disk/node, dual GigEnet | Yes (U) til 3/05
MHPCC, Maui, HI | Unclass/Class - Koa: 128 node dual Xeon, Linux (system moves between environments) | 4 GB DRAM and 80 GB disk/node, shared file system, dual GigEnet | Yes

Name | Program | Contact Information
Dr. Richard Linderman | HPC for Information Management | 315-330-2208, Richard.Linderman@rl.af.mil
Dr. Bob Lucas | USJFCOM J9 | 310-448-9449, rflucas@isi.edu
Dr. Stan Ahalt | PET-SIP CTP | 614-292-9524, ahalt@osc.edu
Dr. Juan Carlos Chaves | Interactive Parallel MATLAB | 410-278-7519, jchaves@arl.army.mil
Dr. Dave Pratt | SBA Force Transformations | 407-243-3308, David.R.Pratt@saic.com
Rob Ehret | Grid-based Collaboration | 937-904-9017, Robert.Ehret@sensors.wpafb.af.mil
Bill McQuay | Grid-based Collaboration | 937-904-9214, William.Quay@sensors.wpafb.af.mil
Dr. John Nehrbass | Web enabled HPC | 937-904-5139, John.Nehrbass@wpafb.af.mil
Dr. Keith Bromley | Signal Image Processing | 619-553-2535, bromley@spawar.navy.mil
Dr. George Ramseyer | Hyperspectral Image Exploitation | 315-330-3492, George.Ramseyer@rl.af.mil
Richard Pei | Interactive Electromagnetics Sim | 732-532-0365, Richard.Pei@us.army.mil
Dr. Ed Zelnio | 3-D SAR Radar Imagery | 937-255-4949 ext. 4214, Ed Zelnio@mbvlab.wpafb.af.mil
John Rooks | Swathbuckler SAR Radar Imagery | 315-330-2618, John.Rooks@rl.af.mil



HPCMP  Benchmarking  and 


[Figure: Level 1 application code profiling]

•  Provide quantitative measures to
support selection of computers in
annual procurement process (TI-XX)

   -  Develop an understanding of our
   key application codes for the
   purpose of guiding code
   developers and users toward
   more efficient applications and
   machine assignments

   -  Replace the current application
   benchmark suite with a judicious
   choice of synthetic benchmarks that
   could be used to predict performance of any HPC
   architecture on the program’s key applications



Operations  Decisions 
Acquisition  Decisions 


%  Direct  feedback  from  M  and 
individual  users 

•  Summary report sent to each
HPC Center

•  Issue addressed and resolved

•  User satisfaction impacts
requirements, allocation,
and utilization statistics
Technology Insertion (TI) Flow Chart

[Flow chart; recoverable steps: Requirements Update, Update Acquisition Plan, Vendors prepare bids]
•  Synthetic codes

—  Basic  hardware  and  system  performance  tests 


—  Meant  to  determine  expected  future  performance 

—  Scalable,  quantitative  synthetic  tests  will  be  used  for 
scoring  and  others  will  be  used  as  system 
performance  checks  by  Usability  Team 

•  Application  codes 

—  Actual  application  codes  as  determined  by 
requirements  and  usage 

—  Meant  to  indicate  current  performance 


CTA | Requirements Percentage, FY [2002] (2003) {2004} | Usage Percentage, FY2002 {2003} | Allocation Percentage, FY2003 {2004} | Average (25% FY 2004 Req, 25% FY 2003 Usage, 50% FY 2004 Alloc), FY [2002] (2003) {2004}
CFD | [35.5%] (36.9%) {38.6%} | 48.3% {37.2%} | 40.7% {44.4%} | [43.3%] (41.6%) {41.2%}
CCM | [15.5%] (18.6%) {16.2%} | 16.4% {21.2%} | 14.2% {12.6%} | [14.2%] (15.9%) {15.7%}
CWO | [21.9%] (19.2%) {20.8%} | 21.3% {23.1%} | 21.9% {17.6%} | [23.3%] (21.1%) {19.8%}
CEA | [4.1%] (4.0%) {4.8%} | 5.1% {4.8%} | 8.2% {6.6%} | [4.9%] (6.4%) {5.7%}
CSM | [11.4%] (11.8%) {11.7%} | 3.5% {7.5%} | 9.6% {11.0%} | [8.3%] (8.6%) {10.3%}
EQM | [3.0%] (3.2%) {2.1%} | 0.6% {1.6%} | 4.0% {3.1%} | [2.3%] (3.0%) {2.4%}
SIP | [1.0%] (1.4%) {1.4%} | 1.2% {1.1%} | 0.2% {0.4%} | [0.4%] (0.7%) {0.8%}
CEN | [0.5%] (0.4%) {0.6%} | 1.3% {1.2%} | 0.1% {1.2%} | [1.4%] (0.5%) {1.1%}
IMT | [2.9%] (0.8%) {0.8%} | 2.1% {0.7%} | 0.7% {1.9%} | [0.9%] (1.1%) {1.3%}
Other | [1.3%] (1.2%) {0.2%} | 0.1% {0.8%} | 0.2% {0.7%} | [0.4%] (0.4%) {0.6%}
FMS | [2.9%] (2.6%) {2.9%} | 0.2% {0.8%} | 0.2% {0.4%} | [0.7%] (0.8%) {1.1%}
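
The "Average" column in the table above is a weighted blend of the other three columns (25% requirements, 25% usage, 50% allocation). A minimal sketch of that calculation follows; the function name is illustrative, and the inputs are the {2004}-series CFD and CWO values from the table.

```python
def weighted_cta_share(requirements, usage, allocation,
                       weights=(0.25, 0.25, 0.50)):
    """Blend three percentage columns using the weights stated in the table header."""
    w_req, w_use, w_alloc = weights
    return w_req * requirements + w_use * usage + w_alloc * allocation


if __name__ == "__main__":
    # CFD: FY 2004 requirements, FY 2003 usage, FY 2004 allocation
    cfd = weighted_cta_share(38.6, 37.2, 44.4)
    # CWO: same three columns
    cwo = weighted_cta_share(20.8, 23.1, 17.6)
    print(f"CFD average: {cfd:.2f}%")
    print(f"CWO average: {cwo:.2f}%")
```

Both results agree with the tabulated averages (41.2% and 19.8%) to within rounding.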

•  Aero - Aeroelasticity CFD code (single test case)
(Fortran, serial vector, 15,000 lines of code)

•  AVUS (Cobalt-60) - Turbulent flow CFD code
(Fortran, MPI, 19,000 lines of code)

•  GAMESS - Quantum chemistry code
(Fortran, MPI, 330,000 lines of code)

•  HYCOM - Ocean circulation modeling code
(Fortran, MPI, 31,000 lines of code)

•  OOCore - Out-of-core solver
(Fortran, MPI, 39,000 lines of code)

•  RFCTH2 - Shock physics code
(~43% Fortran/~57% C, MPI, 436,000 lines of code)

•  WRF - Multi-Agency mesoscale atmospheric modeling code (single test case)
(Fortran and C, MPI, 100,000 lines of code)

•  Overflow-2 - CFD code originally developed by NASA
(Fortran 90, MPI, 83,900 lines of code)


CTA | Benchmark | Size | Unclassified % | Classified %
CSM | RF-CTH | Standard | a% | A%
CSM+CFD | RF-CTH | Large | b% | B%
CFD | Cobalt60 | Standard | c% | C%
CFD | Cobalt60 | Large | d% | D%
CFD | Aero | Standard | e% | E%
CEA+SIP | OOCore | Standard | f% | F%
CEA+SIP | OOCore | Large | g% | G%
CCM+CEN | GAMESS | Standard | h% | H%
CCM+CEN | GAMESS | Large | i% | I%
CCM | NAMD | Standard | j% | J%
CCM | NAMD | Large | k% | K%
CWO | HYCOM | Standard | l% | L%
CWO | HYCOM | Large | m% | M%
Total | | | 100.00% | 100.00%

•  Establish  a  DoD  standard  benchmark  time  for  each 
application  benchmark  case 

—  NAVO  IBM  Regatta  P4  (Marcellas)  chosen  as  standard  DoD 
system  for  TI-04  (Initially  IBM  SP3  —  HABL-) 

•  Benchmark  timings  (at  least  three  on  each  test  case)  are 
requested  for  systems  that  meet  or  beat  the  DoD 
standard  benchmark  times  by  at  least  a  factor  of  two 
(preferably  up  to  four) 

•  Benchmark  timings  may  be  extrapolated  provided  they 
are  guaranteed,  but  at  least  one  actual  timing  on  the 
offered  or  closely  related  system  must  be  provided 
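
The threshold described above (at least three timings per test case, and at least a factor of two faster than the DoD standard time) can be sketched as a simple check. The function names, the use of the median to summarize repeated timings, and the sample numbers are assumptions made for illustration; the slides do not say how repeated timings are combined.

```python
from statistics import median


def speedup_vs_standard(standard_time_s, measured_times_s):
    """Speedup of an offered system relative to the DoD standard benchmark time.

    Requires at least three timings, as the slide requests. Summarizing them
    with the median is an assumption, not something the source specifies.
    """
    if len(measured_times_s) < 3:
        raise ValueError("at least three timings are requested per test case")
    return standard_time_s / median(measured_times_s)


def meets_threshold(standard_time_s, measured_times_s, factor=2.0):
    """True when the system beats the standard time by at least `factor`."""
    return speedup_vs_standard(standard_time_s, measured_times_s) >= factor


if __name__ == "__main__":
    # Hypothetical numbers: standard system takes 1000 s on this test case.
    print(meets_threshold(1000.0, [480.0, 500.0, 510.0]))
    print(meets_threshold(1000.0, [600.0, 610.0, 620.0]))
```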
HPCMP System Performance
Optimize Total Price/Performance


System | Total # Proc | Opt #1 | Opt #2 | Opt #3 | Opt #4
A | 64 | 1 | 1 | 0 | 0
B | 188 | 0 | 2 | 3 | 0
C | 128 | 0 | 0 | 0 | 4
C | 256 | 0 | 2 | 4 | 0
D | 256 | 15 | 0 | 0 | 12
D | [incomplete row; surviving values: 0, 4, 1]
E | 256 | 1 | 1 | 3 | 0
Performance / Life Cycle | | 3.03 | 3.02 | 2.97 | 2.95

The  optimizer  produces  a  list  of  system  solutions  in  rank 
order  based  upon  Performance  /  Life  Cycle  Cost 
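
The final ranking step can be illustrated with a short sketch. It covers only the ordering by performance divided by life-cycle cost, not the optimization that builds the candidate system mixes; the class, function, and all numbers are hypothetical.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    name: str
    performance: float       # aggregate benchmark performance (arbitrary units)
    life_cycle_cost: float   # total cost over the system lifetime (arbitrary units)


def rank_solutions(solutions):
    """Return solutions sorted from best to worst performance per unit life-cycle cost."""
    return sorted(solutions,
                  key=lambda s: s.performance / s.life_cycle_cost,
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical candidate mixes, not taken from the slides.
    candidates = [
        Solution("mix_b", 29.5, 10.0),
        Solution("mix_a", 30.3, 10.0),
        Solution("mix_c", 36.24, 12.0),
    ]
    for s in rank_solutions(candidates):
        print(s.name, round(s.performance / s.life_cycle_cost, 2))
```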
Requirement Trends
The slope of this semi-log plot for the entire set of data equates to a
constant factor of (1.76 ± 0.26), although the slopes for the last two years
have been 1.42 and 1.48, respectively.
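
A growth factor of this kind can be recovered from a semi-log plot by fitting a straight line to the logarithm of the data and exponentiating the slope. The sketch below assumes the factor is a year-over-year multiplier and uses synthetic data; neither the function name nor the sample values come from the slides.

```python
import math


def annual_growth_factor(years, values):
    """Least-squares slope of ln(value) versus year, returned as a multiplicative factor."""
    logs = [math.log(v) for v in values]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return math.exp(sxy / sxx)


if __name__ == "__main__":
    # Synthetic requirement data growing by exactly 1.76x per year (hypothetical).
    years = [2000, 2001, 2002, 2003, 2004]
    values = [100.0 * 1.76 ** k for k in range(len(years))]
    print(round(annual_growth_factor(years, values), 2))
```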


Supercomputer Price-Performance Trends


High  Performance  Computing  Modernization  Program 
Benchmarks

Today

Dedicated Applications

•  80% weight

•  Real codes

•  Representative data sets

Synthetic Benchmarks

•  20% weight
•  Future look:
   -  Focus on key machine features

Tomorrow

Synthetic Benchmarks

•  100% weight

•  Coordinated to application
“signature”

•  Performance on real codes
accurately predicted from
synthetic benchmark results

•  Supported by genuine “signature”
databases


Next 1-2 years key: must prove that synthetic benchmarks and
application “signatures” can be coordinated


•  Began at behest of HPC User Forum in partnership with NSA

•  Has evolved to multi-year plan: how key application codes perform on HPC systems
   -  Maximizing use of current HPC resources
   -  Predicting performance of future HPC resources

•  Performers include
   -  Programming Environment and Training (PET) partners
   -  Performance Modeling and Characterization Laboratory (PMaC) at SDSC
   -  Computational Science and Engineering Group at ERDC
   -  Instrumental, Inc.

•  Research and production activities include
   -  Profiling key DoD application codes at several different levels
   -  Characterizing HPC systems with a set of system probes (synthetic benchmarks)
   -  Predicting HPC system performance based on application profiles
   -  Determining a minimal set of HPC system attributes necessary to model performance
   -  Constructing the appropriate set of synthetic benchmarks to accurately model the
   HPCMP computational workload to use in system acquisitions
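
The idea of predicting performance from an application profile and a set of system probes can be illustrated with a deliberately simple model: divide each counted operation type in the profile by the rate the probes measured for it, then sum the times. This is a sketch under strong assumptions (no overlap between computation, memory traffic, and communication), not the method used by the performers listed above; all names and numbers are hypothetical.

```python
def predict_runtime(profile, probe_rates):
    """Estimate run time in seconds from operation counts and measured rates.

    profile:     operation kind -> amount of work (e.g. flops, bytes moved)
    probe_rates: operation kind -> sustained rate from a synthetic probe,
                 in the same units per second
    Assumes the operation kinds do not overlap in time.
    """
    return sum(amount / probe_rates[kind] for kind, amount in profile.items())


if __name__ == "__main__":
    # Hypothetical application profile and probe results.
    profile = {"flops": 2.0e12, "memory_bytes": 4.0e12, "network_bytes": 1.0e11}
    probe_rates = {"flops": 1.0e12, "memory_bytes": 2.0e12, "network_bytes": 1.0e11}
    print(predict_runtime(profile, probe_rates), "seconds")
```

Keeping the profile and the probe results as separate inputs mirrors the split between profiling the codes and characterizing the systems.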


Support for TI-05 (Scope and Schedule)

•  Level 3 application code profiling
   -  Eight application codes, 14 unique test cases
   -  Each test case to be run at 3 different processor counts

•  Predictions for existing systems
   -  21 systems at 7 centers (some overlap possible in predictions)
   -  Benchmarking POCs identified for each center
   -  Goal: benchmarking results and predictions complete by Dec 2004

•  Predictions for offered systems
   -  Goal: benchmarking results finalized by 19 November 2004; all
   predictions completed by 31 December 2004

•  Sensitivity Analysis
   -  Goal: Determine how accurate a prediction we need


Performance  Prediction  Uncertainty  Analysis 

•  Overall  goal:  Understand  and  accurately  estimate  uncertainties  in 
performance  predictions 

•  Determine  functional  form  of  performance  prediction  equations  and 
develop  uncertainty  equation 

•  Determine  uncertainties  in  underlying  measured  values  from  system 
probes  and  application  profiling  and  use  uncertainty  equation  to 
estimate  uncertainties 

•  Compare  results  of  performance  prediction  to  measured  timings  and 
uncertainties  of  these  results  to  predicted  uncertainties 

•  Assess  uncertainties  in  measured  timings  and  determine  whether 
acceptable  agreement  is  obtained 

•  Eventual  goal:  propagate  uncertainties  in  performance  prediction  to 
determine  uncertainties  in  acquisition  scoring 


•  Assumption:  Uncertainties  in  measured  performance 
values  can  be  treated  as  uncertainties  in  measurements  of 
physical  quantities 

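•  For small, random uncertainties in measured values x, y, z,
the uncertainty in a calculated function q(x, y, z) can
be expressed as:

\[
\delta q = \sqrt{\left(\frac{\partial q}{\partial x}\,\delta x\right)^{2} + \left(\frac{\partial q}{\partial y}\,\delta y\right)^{2} + \left(\frac{\partial q}{\partial z}\,\delta z\right)^{2}}
\]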

•  Systematic  errors  need  careful  consideration  since  they 
cannot  be  calculated  analytically 
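
The propagation of small, random, independent uncertainties through a calculated function can be sketched numerically by estimating each partial derivative with a central finite difference and combining the terms in quadrature. The function name, step size, and example values are illustrative assumptions; the sketch does not address systematic errors.

```python
import math


def propagate_uncertainty(q, values, uncertainties, rel_step=1e-6):
    """Combine independent random uncertainties in quadrature.

    q:             function of the measured values, called as q(*values)
    values:        measured values (x, y, z, ...)
    uncertainties: their uncertainties, in the same order
    Partial derivatives are estimated with central finite differences.
    """
    total = 0.0
    for i, (v, dv) in enumerate(zip(values, uncertainties)):
        h = rel_step * max(abs(v), 1.0)
        up = list(values)
        down = list(values)
        up[i] = v + h
        down[i] = v - h
        dq_dv = (q(*up) - q(*down)) / (2.0 * h)
        total += (dq_dv * dv) ** 2
    return math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical example: q = x * y with x = 2.0 +/- 0.1 and y = 3.0 +/- 0.2.
    dq = propagate_uncertainty(lambda x, y: x * y, [2.0, 3.0], [0.1, 0.2])
    print(round(dq, 3))
```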
Performance Measurement: Closing Thoughts

•  Clearly identify your goals
   -  Maximize the amount of work given fixed $ and time.
   -  Alternative goals: power consumption, weight, volume

•  Define Work Flow
   -  Production (run) time
   -  Alternative goals: development time, problem set-up
   time, result analysis time

•  Validate Measures
   -  Understand the error bounds

•  Don’t rely on “Marketing” specifications!


